Euclid
Distilling LLM Agent into Small Models with Retrieval and Code Tools
Kang, Minki, Jeong, Jongwon, Lee, Seanie, Cho, Jaewoong, Hwang, Sung Ju
Large language models (LLMs) excel at complex reasoning tasks but remain computationally expensive, limiting their practical deployment. To address this, recent works have focused on distilling reasoning capabilities into smaller language models (sLMs) using chain-of-thought (CoT) traces from teacher LLMs. However, this approach struggles in scenarios requiring rare factual knowledge or precise computation, where sLMs often hallucinate due to limited capability. In this work, we propose Agent Distillation, a framework for transferring not only reasoning capability but full task-solving behavior from LLM-based agents into sLMs with retrieval and code tools. We improve agent distillation along two complementary axes: (1) we introduce a prompting method called first-thought prefix to enhance the quality of teacher-generated trajectories; and (2) we propose a self-consistent action generation for improving test-time robustness of small agents. We evaluate our method on eight reasoning tasks across factual and mathematical domains, covering both in-domain and out-of-domain generalization. Our results show that sLMs as small as 0.5B, 1.5B, 3B parameters can achieve performance competitive with next-tier larger 1.5B, 3B, 7B models fine-tuned using CoT distillation, demonstrating the potential of agent distillation for building practical, tool-using small agents. Our code is available at https://github.com/Nardien/agent-distillation.
- North America > United States > Ohio > Cuyahoga County > Euclid (0.14)
- Europe > Austria > Vienna (0.14)
- North America > United States > Florida > Miami-Dade County > Miami (0.14)
- (18 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Education (1.00)
- Information Technology (0.67)
Simulating Realistic Post-Stroke Reaching Kinematics with Generative Adversarial Networks
Hadley, Aaron J., Pulliam, Christopher L.
The generalizability of machine learning (ML) models for wearable monitoring in stroke rehabilitation is often constrained by the limited scale and heterogeneity of available data. Data augmentation addresses this challenge by adding computationally derived data to real data to enrich the variability represented in the training set. Traditional augmentation methods, such as rotation, permutation, and time-warping, have shown some benefits in improving classifier performance, but often fail to produce realistic training examples. This study employs Conditional Generative Adversarial Networks (cGANs) to create synthetic kinematic data from a publicly available dataset, closely mimicking the experimentally measured reaching movements of stroke survivors. This approach not only captures the complex temporal dynamics and common movement patterns after stroke, but also significantly enhances the training dataset. By training deep learning models on both synthetic and experimental data, we achieved a substantial enhancement in task classification accuracy: models incorporating synthetic data attained an overall accuracy of 80.2%, significantly higher than the 63.1% seen in models trained solely with real data. These improvements allow for more precise task classification, offering clinicians the potential to monitor patient progress more accurately and tailor rehabilitation interventions more effectively.
- Europe > Switzerland > Basel-City > Basel (0.05)
- North America > United States > Ohio > Cuyahoga County > South Euclid (0.04)
- North America > United States > Ohio > Cuyahoga County > Euclid (0.04)
- (4 more...)
- Research Report > Experimental Study (0.68)
- Research Report > New Finding (0.46)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.90)